Applying pitch-dependent difference detection and modification to emotional speaker recognition

Authors

Ting Huang

Yingchun Yang

Abstract

Emotion is an internal factor that can degrade speaker recognition performance by inducing extra intra-speaker vocal variability. Several enhancements have been applied to speaker recognition systems under emotional speech. However, these methods are limited in that they require either emotional speech for training or knowledge of the speaker's emotional state at test time. This paper presents a novel approach based on Pitch-dependent Difference Detection and Modification (PDDM) that overcomes this limitation. In this method, only neutral speech is used to train the speaker models, and no emotional state information is needed during testing. Experimental results on MASC show that, in the best case, this method improves the identification rate by 4.7% compared to traditional speaker recognition.


Similar resources

MFCC based Enlargement of the Training Set for Emotion Recognition in Speech

Emotional state recognition through speech is a very interesting research topic nowadays. Using subliminal information in speech, known as “prosody”, it is possible to recognize the emotional state of a person. One of the main problems in the design of automatic emotion recognition systems is the small number of available patterns. This fact makes the learning process more difficu...


Statistical Variation Analysis of Formant and Pitch Frequencies in Anger and Happiness Emotional Sentences in Farsi Language

The setup of an emotion recognition or emotional speech recognition system depends directly on how emotion changes speech features. In this research, the influence of emotion on speech in anger and happiness was evaluated, and the results were compared with neutral speech. To this end, the pitch frequency and the first three formant frequencies were used. The experimental results showed that there are lo...


Synthetical Enlargement of Mfcc Based Training Sets for Emotion Recognition

Emotional state recognition through speech is a very interesting research topic nowadays. Using subliminal information in speech, it is possible to recognize the emotional state of a person. One of the main problems in the design of automatic emotion recognition systems is the small number of available patterns. This fact makes the learning process more difficult, due to the generalizat...


Text-independent Speaker Identification Based on MAP Channel Compensation and Pitch-dependent Features

One major source of performance decline in speaker recognition systems is channel mismatch between training and testing. This paper focuses on improving the channel robustness of speaker recognition systems in two respects: channel compensation techniques and channel-robust features. The system is a text-independent speaker identification system based on two-stage recognition. In the aspect of channel ...


Applying Score Reliability Fusion to Bi-Model Emotional Speaker Recognition

Emotion mismatch between training and testing is one of the important factors degrading the performance of speaker recognition systems. In our previous work, a bi-model emotion speaker recognition (BESR) method based on virtual HD (High Different from neutral, with large pitch offset) speech synthesizing was proposed to deal with this problem. It enhanced the system performance under m...



Publication date: 2008
